Model-based STFT phase recovery for audio source separation

Authors

Paul Magron

Roland Badeau

Bertrand David

Abstract

In audio source separation, it is common to estimate the magnitude of the Time-Frequency (TF) representation of each source. To recover a time-domain signal from a spectrogram, for instance, it then becomes necessary to recover the phase of the corresponding complex-valued Short-Time Fourier Transform (STFT). Most authors in this field choose a Wiener-like filtering approach, which boils down to using the phase of the original mixture. In this paper, a different standpoint is adopted. Many music events are partially composed of slowly varying sinusoids, and the STFT phase increment of those frequency components takes a specific form. This allows phase recovery by an unwrapping technique once a short-term frequency estimate has been obtained. Building on these results, a complete iterative source separation procedure is proposed. It is tested on a variety of data, both synthetic and realistic, and in different source separation scenarios, oracle and non-oracle. In terms of SIR, SAR and SDR, the method achieves better performance than consistency-based approaches. To complete the experimental analysis, sound examples are provided so that the reader can assess how much the method improves sound quality.
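The phase-unwrapping idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard relation for a stationary sinusoid of normalized frequency ν (cycles per sample) analyzed with hop size H, namely φ(f, t) = φ(f, t−1) + 2πHν(f, t). The abstract does not spell out this formula, so treat it as an assumption. The function names `unwrap_phase`, `stft_bin` and `demo` are hypothetical, and the frequency is taken as known rather than estimated, so this is not the paper's full iterative procedure.

```python
import numpy as np

def unwrap_phase(freqs, phase0, hop):
    """Propagate phase across frames for slowly varying sinusoids.

    freqs:  (F, T) normalized frequencies in cycles per sample (assumed known)
    phase0: (F,) phases of the first frame
    hop:    hop size in samples
    """
    freqs = np.asarray(freqs, dtype=float)
    phase0 = np.asarray(phase0, dtype=float)
    phase = np.empty_like(freqs)
    phase[:, 0] = phase0
    # Assumed recurrence: phi(f, t) = phi(f, t-1) + 2*pi*hop*nu(f, t)
    increments = 2.0 * np.pi * hop * freqs[:, 1:]
    phase[:, 1:] = phase0[:, None] + np.cumsum(increments, axis=1)
    return phase

def stft_bin(x, win, hop, k, n_frames):
    """Direct STFT of x at frequency bin k for the first n_frames frames."""
    n = np.arange(len(win))
    kernel = win * np.exp(-2j * np.pi * k * n / len(win))
    return np.array([np.sum(x[t * hop:t * hop + len(win)] * kernel)
                     for t in range(n_frames)])

def demo():
    """Compare the unwrapped phase with the true STFT phase of a sinusoid."""
    n_fft, hop, n_frames, k = 256, 64, 8, 20
    nu = 20.3 / n_fft  # frequency lying between bins 20 and 21
    x = np.cos(2 * np.pi * nu * np.arange(n_fft + hop * n_frames) + 0.7)
    X = stft_bin(x, np.hanning(n_fft), hop, k, n_frames)
    freqs = np.full((1, n_frames), nu)
    phase = unwrap_phase(freqs, np.angle(X[:1]), hop)
    # Wrap the phase error to (-pi, pi] before measuring it
    err = np.angle(np.exp(1j * (phase[0] - np.angle(X))))
    return float(np.max(np.abs(err)))
```

Calling `demo()` returns the largest wrapped phase error between the propagated phase and the true STFT phase at the peak bin. For a clean stationary sinusoid it should be close to zero, because only the first frame's phase and the frequency are needed to predict all later frames.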

Similar resources

Time-Frequency Trade-offs for Audio Source Separation with Binary Masks

The short-time Fourier transform (STFT) provides the foundation of binary-mask based audio source separation approaches. In computing a spectrogram, the STFT window size parameterizes the trade-off between time and frequency resolution. However, it is not yet known how this parameter affects the operation of the binary mask in terms of separation quality for real-world signals such as speech or...

STFT based Blind Separation of Underdetermined Speech Mixtures

Analysis of non-stationary signals such as audio, speech and biomedical signals requires good resolution in both time and frequency, as their spectral components are not fixed. Time-frequency analysis of non-stationary signals has many applications, such as source separation and signal denoising. This paper presents an application of time-frequency analysis using STFT, Short Time Fourier Trans...

Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation Factorisation en matrices à coefficients positifs de données multicanal convolutives pour la séparation de sources audio

We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the Short-Time Fourier Transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is given a model inspired from nonnegativ...

Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction

As many acoustic signal processing methods, for example for source separation or noise canceling, operate in the magnitude spectrogram domain, the problem of reconstructing a perceptually good sounding signal from a modified magnitude spectrogram, and more generally to understand what makes a spectrogram consistent, is very important. In this article, we derive the constraints which a set of co...

Publication date: 2016
